Initial Package Installations

First, you will need to install some packages:

You only need to install them once, so there is no need to rerun this step after the first time.
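As a minimal sketch of that step, the snippet below installs only the packages that are not already present. The package list is an assumption based on what this document uses or appears to use (ggplot2, plotly, and stargazer-style output, with DT as one possible choice for interactive tables); adjust it to match your own setup.

```r
# Packages assumed for this walkthrough (an illustrative list, not a definitive one)
pkgs <- c("ggplot2", "plotly", "DT", "stargazer")

# Check which of them are not yet installed
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

# Install only what is missing, so rerunning this is harmless
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```

Because the check runs before `install.packages`, leaving this chunk in a document does not trigger a reinstall on every run.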

Our Data

We are going to make it easy on ourselves and use data that ships with an R package. We are going to use the “diamonds” data from the ggplot2 package.
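One way to load the data, assuming ggplot2 is installed, is shown below. The `data()` call pulls the dataset into the workspace without attaching the whole package.

```r
# Load the diamonds data that ships with ggplot2
data("diamonds", package = "ggplot2")

# Confirm the size of what we loaded: rows and columns
dim(diamonds)
```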

Data Exploration

Data exploration is an important part of the statistical programming process. We can start by taking a peek at our actual data. This could also be helpful for people who may want to take a look at our document.
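A first look could be as simple as the following sketch, which uses only base R functions. The original document may have used an interactive table here instead; this is just one plain way to inspect the data.

```r
data("diamonds", package = "ggplot2")

# First few rows
preview <- head(diamonds)
preview

# Column types and a compact view of the values
str(diamonds)

# Per-column summaries (quartiles for numeric columns, counts for factors)
summary(diamonds)
```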

We often like to take an initial look at the correlations within our data. This can be enlightening, but dealing with a massive correlation matrix can be a hassle. Instead, we can go the visual route. Before we produce our visualization, we need to do a little data work. The diamonds data contains a few variables (cut, color, and clarity, which are ordered factors) that would need some work before being thrown into a correlation matrix. We will, instead, just work with the purely numeric variables.
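The sketch below keeps only the numeric columns, computes their correlations, and draws them as a heatmap. The heatmap is built with ggplot2's `geom_tile` as an assumption; the original document may have used a dedicated correlation-plot package, and the object names here are illustrative.

```r
library(ggplot2)
data("diamonds", package = "ggplot2")

# Keep only the purely numeric variables
is_num <- vapply(diamonds, is.numeric, logical(1))
numeric_diamonds <- as.data.frame(diamonds)[, is_num]

# Pairwise Pearson correlations
cor_matrix <- cor(numeric_diamonds)

# Reshape the matrix into long form: one row per pair of variables
cor_long <- as.data.frame(as.table(cor_matrix))
names(cor_long) <- c("var1", "var2", "correlation")

# Draw the correlations as coloured tiles on a fixed -1 to 1 scale
cor_plot <- ggplot(cor_long, aes(x = var1, y = var2, fill = correlation)) +
  geom_tile() +
  scale_fill_gradient2(limits = c(-1, 1)) +
  labs(x = NULL, y = NULL)
cor_plot
```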

Data Visualization

Static

Often, we might just want a nice static visualization. Static, however, does not mean simple. R gives us the ability to create layers within our visualizations. Likely the most popular package for static visualizations is ggplot2.
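To illustrate the idea of layers, here is a hedged example that plots price against carat. The choice of variables and styling is an assumption made for illustration; each `+` adds another layer or setting on top of the base plot.

```r
library(ggplot2)

price_plot <- ggplot(diamonds, aes(x = carat, y = price)) +
  # Layer 1: the raw points, made transparent to cope with overplotting
  geom_point(alpha = 0.1) +
  # Layer 2: a straight-line fit through the points
  geom_smooth(method = "lm", se = FALSE) +
  # Labels and theme
  labs(x = "Carat", y = "Price") +
  theme_minimal()
price_plot
```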

Interactive

There are a wide variety of packages designed for interactive visualizations in R. We already saw a few of the htmlwidget packages in use. Since we already have a ggplot created, we can just pass it to the ggplotly function from the plotly package.
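A minimal sketch of that conversion follows. It rebuilds a small ggplot so the example stands alone, and it plots a random sample of 2,000 rows, which is an assumption made here to keep the interactive widget light rather than something the original document specifies.

```r
library(ggplot2)
library(plotly)

# Sample the data so the widget stays responsive (illustrative choice)
set.seed(1)
diamonds_sample <- diamonds[sample(nrow(diamonds), 2000), ]

# An ordinary static ggplot
p <- ggplot(diamonds_sample, aes(x = carat, y = price)) +
  geom_point(alpha = 0.3)

# Convert the static plot into an interactive plotly widget
interactive_plot <- ggplotly(p)
interactive_plot
```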

Results

We can do a lot of different things with our results. For example, we might want some of our results to appear directly within our text: there is a significant relationship between carats and price (r = 0.92).
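The correlation quoted in that sentence can be computed as sketched below, so the number in the text comes from the data rather than being typed by hand. The object names are illustrative.

```r
data("diamonds", package = "ggplot2")

# Pearson correlation test between carat and price
carat_price <- cor.test(diamonds$carat, diamonds$price)

# Values to report in the text
r_value <- round(unname(carat_price$estimate), 2)
p_value <- carat_price$p.value

r_value
```

In an R Markdown document, a value such as `r_value` can then be placed in a sentence with an inline R expression instead of being copied manually.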

We might want to include some more expanded results.

The following is our standard regression output, but in a nice interactive table:
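One way such a table could be produced is sketched here. The model formula (price regressed on carat) and the use of `DT::datatable` are assumptions; the original document does not show which model or table package it used.

```r
library(DT)
data("diamonds", package = "ggplot2")

# Fit a simple linear regression (assumed model, for illustration)
price_model <- lm(price ~ carat, data = diamonds)

# Pull the coefficient table out of the model summary and round it
coef_table <- as.data.frame(round(coef(summary(price_model)), 3))

# Wrap it in an interactive, sortable HTML table
regression_widget <- datatable(coef_table)
regression_widget
```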

We might also want some model comparisons:

                      Dependent variable: price
                      (1)                               (2)
carat                 7,756.426***                      7,765.141***
                      (14.067)                          (14.009)
depth                                                   -102.165***
                                                        (4.635)
Constant              -2,256.361***                     4,045.333***
                      (13.055)                          (286.205)
Observations          53,940                            53,940
R²                    0.849                             0.851
Adjusted R²           0.849                             0.851
Residual Std. Error   1,548.562 (df = 53938)            1,541.649 (df = 53937)
F Statistic           304,050.900*** (df = 1; 53938)    153,634.800*** (df = 2; 53937)
Note: *p<0.1; **p<0.05; ***p<0.01
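A comparison table of this kind can be generated along the following lines. The two formulas are inferred from the rows of the table above (carat alone, then carat plus depth), and the use of the stargazer package is an assumption based on the table's layout.

```r
library(stargazer)
data("diamonds", package = "ggplot2")

# Model 1: price explained by carat only
model_1 <- lm(price ~ carat, data = diamonds)

# Model 2: price explained by carat and depth
model_2 <- lm(price ~ carat + depth, data = diamonds)

# Side-by-side comparison; type = "html" or "latex" suits rendered documents
stargazer(model_1, model_2, type = "text")
```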